N=1:

rank | frequency | n-gram |
---|---|---|
1 | 7926 | -s |
2 | 6005 | -a |
3 | 5669 | -o |
4 | 4214 | -e |
5 | 3560 | -n |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 2644 | -os |
2 | 2323 | -as |
3 | 2208 | -es |
4 | 1651 | -do |
5 | 1486 | -on |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 928 | -çon |
2 | 774 | -nte |
3 | 753 | -ado |
4 | 666 | --se |
5 | 647 | -dos |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 664 | -açon |
2 | 624 | -ente |
3 | 480 | -ones |
4 | 437 | -ados |
5 | 337 | -dade |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 434 | -mente |
2 | 307 | -çones |
3 | 290 | -idade |
4 | 243 | -iento |
5 | 143 | -éncia |

The tables show the most frequent letter N-grams at the end of words for N=1…5, each written with a leading "-" to mark it as a word ending. The procedure parallels 2.2.5 Most frequent word beginnings; the aim here is to detect suffixes rather than prefixes.
For N=3:
SELECT @pos:=(@pos+1), xx.* FROM (SELECT @pos:=0) r, (SELECT COUNT(*) AS cnt, CONCAT("-", RIGHT(word,3)) AS ngram FROM words WHERE w_id>100 GROUP BY RIGHT(word,3) ORDER BY cnt DESC) xx LIMIT 5;
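
The same counting can be sketched outside the database. The Python snippet below is an illustration only, not the method used to produce the tables: `top_endings` and the `sample` word list are hypothetical, and it assumes the input is a list of distinct word forms, so that each word counts once, as each row of `words` does in the query.

```python
from collections import Counter

def top_endings(words, n, limit=5):
    """Return (rank, count, ending) rows for the most frequent n-letter endings."""
    # Like RIGHT(word, n) in the query, a word shorter than n is kept whole.
    counts = Counter("-" + word[-n:] for word in words)
    # most_common() sorts by descending count, like ORDER BY cnt DESC.
    return [
        (rank, cnt, ngram)
        for rank, (ngram, cnt) in enumerate(counts.most_common(limit), start=1)
    ]

# Toy word list, invented for illustration; not taken from the corpus.
sample = ["casas", "cousas", "livros", "todos", "amigos", "mente"]

for row in top_endings(sample, n=2):
    print(row)
```

Calling `top_endings(words, n)` for n from 1 to 5 would yield one ranking per table above. Unlike the query, the sketch does not skip the entries with `w_id` up to 100.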
2.2.5 Most frequent word beginnings